Neural Network for Principle of Least Action

The principle of least action is the cornerstone of classical mechanics, theory of relativity, quantum mechanics, and thermodynamics. Here, we describe how a neural network (NN) learns to find the trajectory for a Lennard-Jones (LJ) system that maintains balance in minimizing the Onsager–Machlup (OM) action and maintaining the energy conservation. The phase-space trajectory thus calculated is in excellent agreement with the corresponding results from the “ground-truth” molecular dynamics (MD) simulation. Furthermore, we show that the NN can easily find structural transformation pathways for LJ clusters, for example, the basin-hopping transformation of an LJ38 from an incomplete Mackay icosahedron to a truncated face-centered cubic octahedron. Unlike MD, the NN computes atomic trajectories over the entire temporal domain in one fell swoop, and the NN time step is a factor of 20 larger than the MD time step. The NN approach to OM action is quite general and can be adapted to model morphometrics in a variety of applications.

Molecular dynamics simulation was initiated from the face-centered cubic (FCC) structure. The reduced number density of the system is 0.787 and the reduced time unit ps. Periodic = 2.068 boundary conditions were imposed, and equations of motion were integrated with the velocity-Verlet algorithm using a timestep of 2 fs. The system was equilibrated for 2 ns at a reduced temperature of 0.695. We use the coordinates of the particles in the equilibrated system as the initial configuration for the NN simulation. To obtain a final state for our boundary-value problem, we run MD simulations using the same starting state.  Table S1: Lennard Jones interaction parameters and lattice constant.

Model details
As shown in Figure 1 in the main text, the architecture of the NN model is a feedforward single-hidden-layer structure with one input and m output units. The output of the network can be written as: where the superscript "0" labels the parameters in hidden layer and "1" is for the parameters in output layer. The activation function is only applied to hidden units. We use this : ↦ network to predict the trajectory from coupled Newton's PDEs for many particles. The output dimension m equals the simulation dimensionality multiplied by the total number of particles. In the case of 16 particle and 3-dimensional simulation, out output size is 48, i.e., given the boundary conditions and the discretized time grid points, we can produce the trajectories of all 16 particles simultaneously using the NN model. Note that the hyper-parameters 1 , 2 Figure S3 shows how the RMSD scales with different number of particles in the system, where we consider 32, 64, 108 and 500-atom systems and an NN architecture with 64 units in the hidden layer. RMSD of the 500-particle system is approximately 1 order of magnitude higher than that of 100-particle system.

Pre-training
Overlapping of particle trajectories will cause large amount of loss due to LJ potential. To avoid this at the parameter initialization stage, we have employed a pre-training strategy as S-6 follows: we first randomize the parameters using Gaussian distribution, then with the { , } knowledge of the initial and final state of the system, we pre-train the network using the following loss function: Although this pre-training method has led to interesting results as presented in the main text, there is still a chance that particles may get too close or even cross their trajectories after pretraining, thus give rise to a tremendous energy loss at the beginning of the main training iteration.
One way to avoid this during the pre-training is to add constraints to limit the minimum interparticle distance in Eq. S2, which will be discussed in future work.

Supplementary results
In addition to the 500-atom LJ system presented in the main text, we studied systems with 108 and 256 atoms. In the next two figures, we compare NN results against MD simulations for 256 atoms.
S-7 Figure S4 visualizes four randomly chosen trajectories generated by the NN (blue), which are coincident with the "ground-truth" MD trajectories (red). The averaged root-mean-square error between the NN and MD trajectories is on the order of 10 -4 in LJ units.    Figure S8 shows how the root-mean-square deviation between the MD and NN trajectories for the 108-atom system changes with number of epochs and simulation time. Figure S8: Root-mean-square deviation (RMSD) between the MD and NN trajectories for a 108-atom LJ system. The deviation of the NN trajectory from MD decreases with the number of epochs and changes with time. The boundary constraints in the loss function ensure that the error is much smaller near the initial and final configurations than in the middle of the time domain. The magnitude of the deviation is below 10 -4 in LJ units.
As a simple example of another application of our method, we consider a process in which the center atom in a 2D LJ cluster moves to the surface. 3,4 Figure S9(a) shows the transformation S-10 pathway generated by the NN from the initial to the final state. The cluster goes through some intermediate states, which agree with the sampled results from the references mentioned above.
S8(b) shows the total and potential energy profiles of the system as a function of time. The standard deviation of the total energy is 0.007 in LJ units, less than 0.6% of the energy value. Figure S9: (a) transition path of a 2D cluster of 7 atoms for the migration of the center atom to the surface. In reduced LJ units, the transition takes place within 1  and the timestep is 0.01 . (b) Total energy (red) and potential energy per atom as a function of time.